
    Boosting with stumps for predicting transcription start sites

    Promoter prediction is a difficult but important problem in gene finding, and it is critical for elucidating the regulation of gene expression. We introduce a new promoter prediction program, CoreBoost, which applies a boosting technique with stumps to select important small-scale as well as large-scale features. CoreBoost greatly improves the localization of transcription start sites. We also demonstrate that better accuracy can be achieved by further utilizing some tissue-specific information.
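
As a rough illustration of what "boosting with stumps" means in general, the sketch below implements a generic AdaBoost-style loop over single-feature threshold rules. It is not the CoreBoost implementation: the function names (`fit_stump`, `adaboost`, `predict`), the exhaustive threshold search, and the toy feature vectors are all assumptions made for this example.

```python
import math


def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump.

    A stump predicts `sign` when row[feature] >= threshold, else `-sign`.
    Labels in y are +1 or -1; w holds one non-negative weight per example.
    """
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] >= thr else -sign


def adaboost(X, y, rounds=10):
    """Train a weighted ensemble of stumps (hypothetical helper)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Up-weight misclassified examples, down-weight correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in model)
    return 1 if score >= 0 else -1


# Toy data: two made-up features per example, labels +1 / -1.
X = [[0.1, 5.0], [0.2, 3.0], [0.8, 4.0], [0.9, 1.0]]
y = [-1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
```

Because each stump looks at a single feature, the features chosen across rounds give a crude view of which inputs matter, which is the sense in which boosting with stumps can act as a feature selector.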

    Using quality scores and longer reads improves accuracy of Solexa read mapping

    Background: Second-generation sequencing has the potential to revolutionize genomics and impact all areas of biomedical science. New technologies will make re-sequencing widely available for such applications as identifying genome variations or interrogating the oligonucleotide content of a large sample (e.g. ChIP-sequencing). The increase in speed, sensitivity and availability of sequencing technology brings demand for advances in computational technology to perform associated analysis tasks. The Solexa/Illumina 1G sequencer can produce tens of millions of reads, ranging in length from ~25–50 nt, in a single experiment. Accurately mapping the reads back to a reference genome is a critical task in almost all applications. Two sources of information that are often ignored when mapping reads from the Solexa technology are the 3' ends of longer reads, which contain a much higher frequency of sequencing errors, and the base-call quality scores. Results: To investigate whether these sources of information can be used to improve accuracy when mapping reads, we developed the RMAP tool, which can map reads having a wide range of lengths and allows base-call quality scores to determine which positions in each read are more important when mapping. We applied RMAP to analyze data re-sequenced from two human BAC regions for varying read lengths, and varying criteria for use of quality scores. RMAP is freely available for downloading at http://rulai.cshl.edu/rmap/. Conclusion: Our results indicate that significant gains in Solexa read mapping performance can be achieved by considering the information in 3' ends of longer reads, and appropriately using the base-call quality scores. The RMAP tool we have developed will enable researchers to effectively exploit this information in targeted re-sequencing projects.
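
One simple way to let quality scores decide which read positions matter is to count mismatches only at positions whose base-call quality reaches a cutoff. The sketch below shows that idea with a brute-force scan. It is not RMAP's algorithm: the function names, the cutoff value, the uniqueness rule, and the toy sequences are all assumptions made for illustration.

```python
def masked_mismatches(read, quals, window, min_qual):
    """Count mismatches, ignoring positions with quality below min_qual."""
    return sum(1 for base, q, ref in zip(read, quals, window)
               if q >= min_qual and base != ref)


def map_read(read, quals, reference, min_qual=20, max_mismatches=2):
    """Return the unique best-matching position of read in reference.

    Returns None when no position is within max_mismatches, or when the
    best score is shared by more than one position (ambiguous mapping).
    """
    hits = []
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mm = masked_mismatches(read, quals, window, min_qual)
        if mm <= max_mismatches:
            hits.append((mm, pos))
    if not hits:
        return None
    hits.sort()
    if len(hits) > 1 and hits[0][0] == hits[1][0]:
        return None
    return hits[0][1]


# Toy example: the read differs from the reference at one low-quality
# base near its 3' end (index 6, quality 5).
reference = "GGGGACGTACCTTTTT"
read = "ACGTACGT"
quals = [30, 30, 30, 30, 30, 30, 5, 30]

with_quality = map_read(read, quals, reference, min_qual=20, max_mismatches=0)
without_quality = map_read(read, quals, reference, min_qual=0, max_mismatches=0)
```

In this toy case the read maps to position 4 when the low-quality base is ignored, and fails to map under the same zero-mismatch limit when every position is counted.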

    TRED: a Transcriptional Regulatory Element Database and a platform for in silico gene regulation studies

    In order to understand gene regulation, accurate and comprehensive knowledge of transcriptional regulatory elements is essential. Here, we report our efforts in building a mammalian Transcriptional Regulatory Element Database (TRED) with associated data analysis functions. It collects cis- and trans-regulatory elements and is dedicated to easy data access and analysis for both single-gene-based and genome-scale studies. Distinguishing features of TRED include: (i) relatively complete genome-wide promoter annotation for human, mouse and rat; (ii) availability of gene transcriptional regulation information including transcription factor binding sites and experimental evidence; (iii) data accuracy ensured by hand curation; (iv) an efficient user interface for easy and flexible data retrieval; and (v) implementation of on-the-fly sequence analysis tools. TRED can provide good training datasets for further genome-wide cis-regulatory element prediction and annotation, assist detailed functional studies and facilitate the deciphering of gene regulatory networks (http://rulai.cshl.edu/TRED).

    Spectroscopic study of light scattering in linear alkylbenzene for liquid scintillator neutrino detectors

    We have set up a light scattering spectrometer to study the depolarization of light scattering in linear alkylbenzene. From the scattering spectra it can be unambiguously shown that the depolarized part of light scattering belongs to Rayleigh scattering. The additional depolarized Rayleigh scattering can make the effective transparency of linear alkylbenzene much better than expected. Therefore, sufficient scintillation photons can be transmitted through the large liquid scintillator detector of JUNO. Our study is crucial to achieving the unprecedented energy resolution of $3\%/\sqrt{E\mathrm{(MeV)}}$ for the JUNO experiment to determine the neutrino mass hierarchy. The spectroscopic method can also be used to determine the origin of the depolarization in other organic solvents used in neutrino experiments. Comment: 6 pages, 5 figures

    Genome-wide promoter extraction and analysis in human, mouse, and rat

    Large-scale and high-throughput genomics research needs reliable and comprehensive genome-wide promoter annotation resources. We have conducted a systematic investigation of how to improve mammalian promoter prediction by incorporating both transcript and conservation information. This enabled us to build a better multispecies promoter annotation pipeline and hence to create CSHLmpd (Cold Spring Harbor Laboratory Mammalian Promoter Database) for the biomedical research community, which can act as a starting reference system for more refined functional annotations.